Similar resources
Developing a Persian Part of Speech Tagger
Assigning grammatical categories to words in a text is an important component of a natural language processing (NLP) system. Corpora tagged with part-of-speech (POS) information are often used as a prerequisite for more complex NLP applications such as information extraction, syntactic parsing, machine translation or semantic field annotation. They are also used to help train statistical models...
A Statistical Part-of-Speech Tagger for Persian
This paper presents the statistical part-of-speech tagger HunPoS trained on a Persian corpus. The experimental results show that HunPoS achieves an overall accuracy of 96.9%, which is the best result reported for Persian part-of-speech tagging.
A Persian Part-Of-Speech Tagger Based on Morphological Analysis
This paper describes a method based on morphological analysis of words for a Persian Part-Of-Speech (POS) tagging system. This is a main part of a process for expanding a large Persian corpus called Peykare (or Textual Corpus of Persian Language). Peykare is divided into two parts: annotated and unannotated. We use the annotated part in order to create an automatic morphological analyze...
Implementing an Efficient Part-Of-Speech Tagger
An efficient implementation of a part-of-speech tagger for Swedish is described. The stochastic tagger uses a well-established Markov model of the language. The tagger tags 92% of unknown words correctly and up to 97% of all words. Several implementation and optimization considerations are discussed. The main contribution of this paper is the thorough description of the tagging algorithm and th...
Rule Based Hindi Part of Speech Tagger
A part-of-speech tagger is an important tool used in developing language translators and information extraction systems. The problem of tagging in natural language processing is to find a way to tag every word in a sentence. In this paper, we present a rule-based part-of-speech tagger for Hindi. Our system is evaluated over a corpus of 26,149 words with 30 different standard part of speech tags for H...
Journal
Journal title: Computer Systems Science and Engineering
Year: 2020
ISSN: 0267-6192
DOI: 10.32604/csse.2020.35.423